Predicting the Predictability: A Unified Approach to the Applicability Domain Problem of QSAR Models

Authors

  • Dragos Horvath
  • Gilles Marcou
  • Alexandre Varnek
Abstract

The present work proposes a unified conceptual framework to describe and quantify the important issue of the Applicability Domains (AD) of Quantitative Structure-Activity Relationships (QSARs). AD models are conceived as meta-models μμ designed to associate an untrustworthiness score with any molecule M subject to property prediction by a QSAR model μ. Untrustworthiness scores, or "AD metrics", Ψμ(M) express the relationship between M (represented by its descriptors in chemical space) and the zones of that space populated by the training molecules underlying model μ. Scores integrating some of the classical AD criteria (similarity-based, box-based) were considered in addition to newly invented terms such as the consensus prediction variance, the dissimilarity to outlier-free training sets, and the correlation breakdown count (the first two being most successful). A loose correlation is expected to exist between this untrustworthiness and the error |Pμ(M) - Pexpt(M)| affecting the property Pμ(M) predicted by μ. While high untrustworthiness does not preclude correct predictions, inaccurate predictions at low untrustworthiness must imperatively be avoided. This kind of relationship is characteristic of the Neighborhood Behavior (NB) problem: dissimilar molecule pairs may or may not display similar properties, but similar molecule pairs with different properties are explicitly "forbidden". Therefore, statistical tools developed to tackle this latter aspect were applied and led to a unified AD metric benchmarking scheme. A first use of untrustworthiness scores resides in the prioritization of predictions, without the need to specify a hard AD border. Moreover, if a significant set of external compounds is available, the formalism allows optimal AD borderlines to be fitted. Finally, consensus AD definitions were built by means of a nonparametric mixing scheme of two AD metrics of comparable quality and shown to outperform their respective parents.
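
The idea of a similarity-based untrustworthiness score used for prioritization can be illustrated with a minimal Python sketch. This is not the authors' implementation: the mean distance to the k nearest training molecules is only one plausible similarity-based metric, and the function names (`knn_untrustworthiness`, `prioritize`, `count_forbidden`), the toy 2-D descriptors, and the cutoff values are all assumptions made for illustration.

```python
import math

def knn_untrustworthiness(query, training, k=3):
    """Hypothetical AD metric: mean Euclidean distance from `query`
    to its k nearest training points in descriptor space."""
    dists = sorted(math.dist(query, t) for t in training)
    k = min(k, len(dists))
    return sum(dists[:k]) / k

def prioritize(queries, training, k=3):
    """Rank query indices from most to least trustworthy (lowest score first).
    No hard AD border is needed for this ranking."""
    scores = [knn_untrustworthiness(q, training, k) for q in queries]
    order = sorted(range(len(queries)), key=lambda i: scores[i])
    return order, scores

def count_forbidden(scores, abs_errors, score_cutoff, error_cutoff):
    """Count the 'forbidden' cases: predictions with low untrustworthiness
    (score <= score_cutoff) that are nevertheless inaccurate."""
    return sum(
        1
        for s, e in zip(scores, abs_errors)
        if s <= score_cutoff and e > error_cutoff
    )

# Toy 2-D descriptor space (assumed data, for illustration only).
training = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
queries = [(0.5, 0.5), (5.0, 5.0), (1.0, 2.0)]

order, scores = prioritize(queries, training, k=2)

# Assumed absolute prediction errors |P_mu(M) - P_expt(M)| for the three queries.
abs_errors = [0.1, 0.2, 3.0]
n_forbidden = count_forbidden(scores, abs_errors, score_cutoff=1.5, error_cutoff=1.0)
```

In this toy setting the query far from the training set is ranked last, yet its prediction happens to be accurate, which the framework tolerates. The query close to the training set with a large error is the kind of case a good AD metric should keep rare. Given a sizeable external set, scanning `score_cutoff` to minimize such cases would be one simple way to fit an AD borderline.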

Similar resources

Quantitative Structure Activity Relationship Analysis of Coumarins as Free Radical Scavengers by Genetic Function Algorithm

The antioxidant properties of coumarin derivatives using the 2,2′-diphenyl-1-picrylhydrazyl (DPPH) radical scavenging assay were investigated by the application of Quantitative Structure Activity Relationship (QSAR) studies. The molecular structures were optimized and submitted for the generation of quantum chemical and molecular descriptors. Genetic Function Algorithm (GFA) was employed in m...

QSAR Modeling of COX-2 Inhibitory Activity of Some Dihydropyridine and Hydroquinoline Derivatives Using Multiple Linear Regression (MLR) Method

COX-2 inhibitory activities of some 1,4-dihydropyridine and 5-oxo-1,4,5,6,7,8-hexahydroquinoline derivatives were modeled by quantitative structure–activity relationship (QSAR) using the stepwise multiple linear regression (SW-MLR) method. The resulting model was robust and predictive, with correlation coefficients (R2) of 0.972 and 0.531 for the training and test groups, respectively. The quality of the mod...

In-silico prediction of Cellular Responses to Polymeric Biomaterials from Their Molecular Descriptors

In this work quantitative structure activity relationship (QSAR) methodology was applied for modeling and prediction of cellular response to polymers that have been designed for tissue engineering. After calculation and screening of molecular descriptors, linear and nonlinear models were developed by using multiple linear regressions (MLR) and artificial neural network (ANN) methods. The root m...

A unified approach to the applicability domain problem of QSAR models

The present work proposes a unified conceptual framework to describe and quantify the important issue of the Applicability Domains (AD) of Quantitative Structure-Activity Relationships (QSARs). AD models are conceived as meta-models designed to associate an untrustworthiness score to any molecule M subject to property prediction by a QSAR model. Untrustworthiness scores or "AD metrics" are an ex...

Journal:
  • Journal of Chemical Information and Modeling

Volume 49, Issue 7

Pages: -

Publication date: 2009